Authority Rankings from HITS, PageRank, and SALSA: Existence, Uniqueness, and Effect of Initialization

Authors

  • Ayman Farahat
  • Thomas LoFaro
  • Joel C. Miller
  • Gregory Rae
  • Lesley A. Ward
Abstract

Algorithms such as Kleinberg’s HITS algorithm, the PageRank algorithm of Brin and Page, and the SALSA algorithm of Lempel and Moran use the link structure of a network of web pages to assign weights to each page in the network. The weights can then be used to rank the pages as authoritative sources. These algorithms share a common underpinning; they find a dominant eigenvector of a nonnegative matrix that describes the link structure of the given network and use the entries of this eigenvector as the page weights. We use this commonality to give a unified treatment, proving the existence of the required eigenvector for the PageRank, HITS, and SALSA algorithms, the uniqueness of the PageRank eigenvector, and the convergence of the algorithms to these eigenvectors. However, we show that the HITS and SALSA eigenvectors need not be unique. We examine how the initialization of the algorithms affects the final weightings produced. We give examples of networks that lead the HITS and SALSA algorithms to return nonunique or nonintuitive rankings. We characterize all such networks in terms of the connectivity of the related HITS authority graph. We propose a modification, Exponentiated Input to HITS, to the adjacency matrix input to the HITS algorithm. We prove that Exponentiated Input to HITS returns a unique ranking, provided that the network is weakly connected. Our examples also show that SALSA can give inconsistent hub and authority weights, due to nonuniqueness. We also mention a small modification to the SALSA initialization which makes the hub and authority weights consistent.
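The nonuniqueness described in the abstract can be illustrated with a minimal sketch of the HITS authority iteration. The four-page network below is a hypothetical example, not one taken from the paper, and normalizing by the sum of the weights at each step is an assumption made for readability (other normalizations are possible). The network has two disconnected links, so the dominant eigenvalue of the authority matrix A^T A is repeated and the limit depends on the starting vector.

```python
def hits_authority(adj, init, iterations=50):
    """Power iteration for HITS authority weights (illustrative sketch).

    adj[i][j] == 1 means page i links to page j.
    init is the starting authority vector; weights are rescaled to sum to 1.
    """
    n = len(adj)
    a = list(init)
    for _ in range(iterations):
        # Hub update: h = A a (a hub scores well if it links to good authorities).
        h = [sum(adj[i][j] * a[j] for j in range(n)) for i in range(n)]
        # Authority update: a = A^T h (an authority scores well if good hubs link to it).
        a = [sum(adj[i][j] * h[i] for i in range(n)) for j in range(n)]
        total = sum(a)
        if total == 0:
            raise ValueError("authority vector vanished; check graph and init")
        a = [x / total for x in a]
    return a


# Hypothetical network: page 0 links to page 1, page 2 links to page 3.
adj = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

uniform = hits_authority(adj, [1, 1, 1, 1])
skewed = hits_authority(adj, [1, 3, 1, 1])
print(uniform)
print(skewed)
```

With the uniform start, pages 1 and 3 end up with equal authority weight, while the skewed start leaves page 1 with three times the weight of page 3. Both vectors are fixed points of the iteration, which is the kind of initialization dependence the abstract attributes to HITS on networks whose authority graph is not connected.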


Similar resources

Rank-Stability and Rank-Similarity of Web Link-Based Ranking Algorithms

The stability of Web link-based ranking algorithms was examined in recent works. Among the aspects investigated were the notions of rank stable algorithms and rank similar algorithms. Of special interest are stability results on a particular class of graphs, called authority connected graphs. This report considers three link-based ranking algorithms: PageRank, HITS and SALSA. We extend previous...


Improved Link-Based Algorithms for Ranking Web Pages

Several link-based algorithms, such as PageRank [19], HITS [15] and SALSA [16], have been developed to evaluate the popularity of web pages. These algorithms can be interpreted as computing the steady-state distribution of various Markov processes over web pages. The PageRank and HITS algorithms tend to over-rank tightly interlinked collections of pages, such as well-organized message boards. W...


HAR: Hub, Authority and Relevance Scores in Multi-Relational Data for Query Search

In this paper, we propose a framework HAR to study the hub and authority scores of objects, and the relevance scores of relations in multi-relational data for query search. The basic idea of our framework is to consider a random walk in multi-relational data, and study in such random walk, limiting probabilities of relations for relevance scores, and of objects for hub scores and authority scor...


Application of PageRank Model for Olympic Women’s Taekwondo Rankings: Comparison of PageRank and Accumulated Point Index System

Background. Although the World Taekwondo federation currently applies the APIS ranking method to calculate the Olympic rankings, some limitations exist. Objectives. This study applies the PageRank model to Olympics Taekwondo rankings. Methods. The 2015-2018 World Taekwondo Grand Prix competition results for women’s four weight classes (-49kg, -57kg, -67kg, +67kg) were used as research data, t...


A Survey of Eigenvector Methods of Web Information Retrieval

Web information retrieval is significantly more challenging than traditional well-controlled, small document collection information retrieval. One main difference between traditional information retrieval and Web information retrieval is the Web’s hyperlink structure. This structure has been exploited by several of today’s leading Web search engines, particularly Google. In this survey paper, w...



Journal:
  • SIAM J. Scientific Computing

Volume 27  Issue 

Pages  -

Publication date: 2006